blood vessels, neurovascular structure in particular, we have also reviewed some of the 
This personal overview of Interface '99 is intended to communicate its meaning and 
relevance to SIGKDD, as well as provide valuable information on trends within the 
Interface for data miners seeking to learn more about statistics. In addition, it is the 
newest link in a bridge between the Interface and KDD begun by References 2-4 and the 
sessions on KDD at Interface '98 and Interface '99. 
A statistical profile summarizes the instances of a database. It describes aspects such as 
the number of tuples, the number of values, the distribution of values, the correlation 
between value sets, and the distribution of tuples among secondary storage units. 
Estimation of database profiles is critical in the problems of query optimization, physical 
database design, and database performance prediction. This paper describes a model of a 
database profile, relates this model to estimating ... 
We propose two methods that reduce the post-nonlinear blind source separation problem 
(PNL-BSS) to a linear BSS problem. The first method is based on the concept of maximal 
correlation: we apply the alternating conditional expectation (ACE) algorithm, a powerful 
technique from non-parametric statistics, to approximately invert the componentwise 
non-linear functions. The second method is a Gaussianizing transformation, which is 
motivated by the fact that linearly mixed signals bef ... 
Statistical estimation and approximate query processing have become increasingly 
prevalent applications for database systems. However, approximation is usually of little 
use without some sort of guarantee on estimation accuracy, or "confidence bound." 
Analytically deriving probabilistic guarantees for database queries over sampled data is a 
daunting task, not suitable for the faint of heart, and certainly beyond the expertise of the 
typical database system end-user. This paper considers the prob ... 
The next-generation astronomy digital archives will cover most of the sky at fine 
resolution in many wavelengths, from X-rays, through ultraviolet, optical, and infrared. 
The archives will be stored at diverse geographical locations. One of the first of these 
projects, the Sloan Digital Sky Survey (SDSS), is creating a 5-wavelength catalog over 
10,000 square degrees of the sky (see http://www.sdss.org/). The 200 million objects in 
the multi-terabyte database will have mostly numerical attribut ... 
A survey of graphics developers on the issue of texture mapping hardware for volume 
rendering would most likely find that the vast majority of them view limited texture 
memory as one of the most serious drawbacks of an otherwise fine technology. In this 
paper, we propose a compression scheme for static and time-varying volumetric data sets 
based on vector quantization that allows us to circumvent this limitation. We describe a 
hierarchical quantization scheme that is based on a multiresolution c ... 
Managing scientific data warehouses requires constant adaptations to cope with changes 
in processing algorithms, computing environments, database schemas, and usage 
patterns. We have faced this challenge in the RHESSI Experimental Data Center (HEDC), 
a datacenter for the RHESSI NASA spacecraft. In this paper we describe our experience in 
developing HEDC and discuss in detail the design choices made. To successfully 
accommodate typical adaptations encountered in scientific data management systems ... 
Umit Y. Ogras, Hakan Ferhatosmanoglu 

November 2003 Proceedings of the twelfth international conference on Information and knowledge management 
Publisher: ACM Press 

Full text available: pdf(93.50 KB) Additional Information: full citation, abstract, references, index terms 

High dimensional data sets are encountered in many modern database applications. The 
usual approach is to construct a summary of the data set through a lossy compression 
technique, and use this lower dimensional synopsis to provide fast, approximate answers 
to the queries. In this paper, we develop a novel dimensionality reduction technique 
based on partitioning the high dimensional vector space into orthogonal subspaces. First, 
we find a relation between the Euclidian distance of two n-dimensio ... 

Keywords: high dimensional data, shape approximation, similarity search 
Manish Parashar, James C. Browne, Carter Edwards, Kenneth Klimkowski 

November 1997 Proceedings of the 1997 ACM/IEEE conference on Supercomputing 
Publisher: ACM Press 

Full text available: pdf(160.93 KB) Additional Information: full citation, abstract, references, citings 

This paper presents the design, development and application of a computational 
infrastructure to support the implementation of parallel adaptive algorithms for the 
solution of sets of partial differential equations. The infrastructure is separated into 
multiple layers of abstraction. This paper is primarily concerned with the two lowest 
layers of this infrastructure: a layer which defines and implements dynamic distributed 
arrays (DDA), and a layer in which several dynamic data and programming ab ... 

Keywords: HP-adaptive finite elements, adaptive mesh-refinement, distributed dynamic 
data structures, fast multipole methods, parallel adaptive algorithm, problem solving 
environment 
Non-photorealistic rendering can be used to illustrate subtle spatial relationships that 
might not be visible with more realistic rendering techniques. We present a parallel 
hardware-accelerated rendering technique, making extensive use of multi-texturing and 
paletted textures, for the interactive non-photorealistic visualization of scalar volume 
data. With this technique, we can render a 512x512x512 volume using non-photorealistic 
techniques that include tone-shading, silhouettes, gradient-base ... 

Keywords: interactive visualization, non-photorealistic rendering, parallel rendering, 
scientific visualization, silhouette, texture graphics hardware, visual perception, volume 
rendering 
Publisher: ACM Press 

Full text available: pdf(209.30 KB) Additional Information: full citation, abstract, references, index terms 

Dimension reduction is critical for many database and data mining applications, such as 
efficient storage and retrieval of high-dimensional data. In the literature, a well-known 
dimension reduction scheme is Linear Discriminant Analysis (LDA). The common aspect of 
previously proposed LDA-based algorithms is the use of Singular Value Decomposition 
(SVD). Due to the difficulty of designing an incremental solution for the eigenvalue 
problem on the product of scatter matrices in LDA, there is little ... 

Keywords: QR decomposition, dimension reduction, incremental learning, linear 
discriminant analysis 
Additional Information: full citation, abstract, references, citings, index terms 
Biotech companies routinely generate vast amounts of biological measurement data that 
must be analyzed rapidly and mined for diagnostic, prognostic, or drug evaluation 
purposes. While these data analysis tasks are critical to their success, they have not 
benefited from recent advances that emerged from database and KDD research. In this 
paper, we focus on two such tasks: on-line analysis of clinical study data, and mining 
broad datasets for biomarkers. We examine the new requirements that are no ... 
simulation methods for risk analysis of collateralized debt obligations 

William J. Morokoff 

December 2003 Proceedings of the 35th conference on Winter simulation: driving 
Full text available: pdf(411.42 KB) Additional Information: full citation, abstract, references 

Collateralized Debt Obligations (CDOs) are sophisticated financial products that offer a 
range of investments, known as tranches, at varying risk levels backed by a collateral 
pool typically consisting of corporate debt (bonds, loans, default swaps, etc.). The 
analysis of the risk-return properties of CDO tranches is complicated by the highly 
nonlinear and time-dependent relationship between the cash flows to the tranche and the 
underlying collateral performance. This paper describes a multip ... 